Binaural deep neural network classification for reverberant speech segregation
Abstract
While human listening is robust in complex auditory scenes, current speech segregation algorithms perform poorly in noisy and reverberant environments. This paper addresses robustness in binaural speech segregation by employing binary classification based on deep neural networks (DNNs). We systematically examine how well the DNNs generalize to untrained configurations. Evaluations and comparisons show that DNN-based binaural classification produces superior segregation performance in a variety of multisource and reverberant conditions.
Similar resources
Binaural Reverberant Speech Separation Based on Deep Neural Networks
Supervised learning has exhibited great potential for speech separation in recent years. In this paper, we focus on separating target speech in reverberant conditions from binaural inputs using supervised learning. Specifically, a deep neural network (DNN) is constructed to map from both spectral and spatial features to a training target. For spectral feature extraction, we first convert binaura...
Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions
Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...
Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments
The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. ...
Speech Segregation based on Binary Classification
Speech segregation is a fundamental challenge in speech and audio processing. This AFOSR project aimed to develop a speech segregation system that can potentially improve speech intelligibility in noise for human listeners. Motivated by the perceptual principles of auditory scene analysis and the speech intelligibility studies of ideal time-frequency masking, the project sought to develop a cla...
Binaural sub-band adaptive speech enhancement using artificial neural networks
© Speech Communication, Vol. 25, Elsevier Science B.V., Amsterdam, The Netherlands, pp. 177-186, 1998. In this paper, a general class of “single-hidden layered, feedforward” Artificial Neural Network (ANN) based adaptive non-linear filters is proposed for processing band-limited signals in a multi-microphone sub-band adaptive speech-enhancement scheme. Initial comparative results achieved in simu...